Collaborative Research Project
Review
Static maps with ggmap
Dynamic results presentation
Static website hosting with gh-pages
20 November 2014
Purpose: Pose an interesting research question and try to answer it using data analysis and standard academic practices. Effectively communicate your results to a variety of audiences in a variety of formats.
Deadlines:
Presentation: In-class Friday 5 December
Website/Paper: 12 December
The project is a 'dry run' for your thesis with multiple presentation outputs.
Presentation: 10 minutes maximum. Engagingly present your research question and key findings to a general academic audience (fellow students).
Paper: 6,000 words maximum. Standard academic paper, properly cited, laying out your research question, literature review, data and methods, and findings.
Website: An engaging website designed to convey your research to a general audience.
As always, you should submit one GitHub repository with all of the materials needed to completely reproduce your data gathering, analysis, and presentation documents.
Note: Because you've had two assignments already to work on parts of the project, I expect high quality work.
What is the data-ink ratio? Why is it important for effective plotting?
What is visual weighting?
Why should you avoid using the size of circles to mean anything in a plot?
How many decimal places should you report in a table?
Last class we didn't have time to cover mapping with ggmap.
We've already seen how ggmap can be used to find latitude and longitude.
library(ggmap)
places <- c('Bavaria', 'Seoul', '6 Parisier Platz, Berlin',
'Hertie School of Governance')
geocode(places)
##         lon      lat
## 1  11.49789 48.79045
## 2 126.97797 37.56654
## 3  13.37854 52.51701
## 4  13.38921 52.51286
qmap(location = 'Berlin', zoom = 15)
Example from: Kahle and Wickham (2013)
Use the crime data set that comes with ggmap.
names(crime)
##  [1] "time"     "date"     "hour"     "premise"  "offense"  "beat"
##  [7] "block"    "street"   "type"     "suffix"   "number"   "month"
## [13] "day"      "location" "address"  "lon"      "lat"
# find a reasonable spatial extent
qmap('houston', zoom = 13) # gglocator(2) see in RStudio
# only violent crimes
violent_crimes <- subset(crime,
offense != "auto theft" & offense != "theft" &
offense != "burglary")
# order violent crimes
violent_crimes$offense <- factor(violent_crimes$offense,
levels = c("robbery", "aggravated assault", "rape", "murder"))
# restrict to downtown
violent_crimes <- subset(violent_crimes,
-95.39681 <= lon & lon <= -95.34188 &
29.73631 <= lat & lat <= 29.78400)
# Set up base map
HoustonMap <- qmap("houston", zoom = 14,
source = "stamen", maptype = "toner",
legend = "topleft")
# Add points
FinalMap <- HoustonMap +
geom_point(aes(x = lon, y = lat, colour = offense,
size = offense),
data = violent_crimes) +
guides(size = guide_legend(title = 'Offense'),
colour = guide_legend(title = 'Offense'))
print(FinalMap)
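The data preparation steps in the listing above can be tried without downloading any map tiles. This is a minimal sketch on a made-up data frame (toy_crime and its rows are hypothetical, not taken from the crime data set); it applies the same offense filter, factor ordering, and bounding box.

```r
# Toy stand-in for the crime data: made-up rows, not real records
toy_crime <- data.frame(
  offense = c("theft", "robbery", "murder", "burglary", "rape", "robbery"),
  lon = c(-95.37, -95.36, -95.35, -95.38, -95.50, -95.39),
  lat = c(29.75, 29.76, 29.74, 29.77, 29.75, 29.90),
  stringsAsFactors = FALSE
)

# Keep only violent offenses (for non-missing values this matches
# chaining several != tests with &)
property_offenses <- c("auto theft", "theft", "burglary")
toy_violent <- subset(toy_crime, !(offense %in% property_offenses))

# Set the offense levels in the same order as the listing above
toy_violent$offense <- factor(toy_violent$offense,
    levels = c("robbery", "aggravated assault", "rape", "murder"))

# Restrict to the downtown bounding box used above
toy_downtown <- subset(toy_violent,
    -95.39681 <= lon & lon <= -95.34188 &
    29.73631 <= lat & lat <= 29.78400)

print(toy_downtown)
```

Only the rows that are both violent offenses and inside the box remain. Checking the row count at each step is a quick way to catch a mistyped coordinate.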
When your output documents are in HTML, you can create interactive visualisations.
These are potentially more engaging and let users explore the data on their own.
Big distinction:
Client Side: Plots are created on the user's (client's) computer. Often JavaScript in the browser. You simply send them static HTML/JavaScript needed for their browser to create the plots.
Server Side: Data manipulations and/or plots (e.g. with Shiny Server) are done on a server.
There are lots of free services (e.g. GitHub Pages) for hosting webpages for client side plot rendering.
You usually have to use a paid service for server-side data manipulation and plotting.
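To make the client-side idea concrete, here is a rough sketch that writes a static HTML file from R. R only embeds the data and a few lines of JavaScript; the bars themselves are drawn by the viewer's browser. The data, file name, and styling are all made up for illustration, and real projects would normally use a package rather than hand-written JavaScript.

```r
# Made-up data to send to the browser
fake <- data.frame(country = c("US", "GB", "BR"),
                   val1 = c(10, 13, 14),
                   stringsAsFactors = FALSE)

# Hand-rolled JSON so the sketch needs no extra packages
json <- paste0("[",
  paste(sprintf('{"country":"%s","val1":%s}', fake$country, fake$val1),
        collapse = ","),
  "]")

# Static page: the <script> runs on the client, not in R
html <- c(
  "<!DOCTYPE html>",
  "<html><body>",
  '<div id="bars"></div>',
  "<script>",
  paste0("var data = ", json, ";"),
  "data.forEach(function(d) {",
  "  var bar = document.createElement('div');",
  "  bar.textContent = d.country + ': ' + d.val1;",
  "  bar.style.width = (d.val1 * 20) + 'px';",
  "  bar.style.background = '#fec44f';",
  "  document.getElementById('bars').appendChild(bar);",
  "});",
  "</script>",
  "</body></html>"
)

out_file <- file.path(tempdir(), "client_side.html")
writeLines(html, out_file)
```

Opening the file in a browser should show three bars. No R process is needed at that point, which is why a static host is enough.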
You already know how to create HTML documents with R Markdown.
Set your code chunk to results='asis'.
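A small sketch of why results='asis' matters: the chunk below prints raw HTML with cat(). With results='asis' in the chunk header, knitr passes that output straight into the document instead of wrapping it as verbatim console output. The helper html_table_row is hypothetical and the values are made up.

```r
# In an R Markdown document the chunk header would be: {r, results='asis'}

# Hypothetical helper that builds one HTML table row per label/value pair
html_table_row <- function(label, value) {
  sprintf("<tr><td>%s</td><td>%s</td></tr>", label, value)
}

rows <- html_table_row(c("US", "GB"), c(10, 13))

# cat() writes the markup as-is, without the [1] prefix or quotes of print()
cat("<table>", rows, "</table>", sep = "\n")
```

Packages that emit HTML or JavaScript rely on the same mechanism.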
There is a growing set of tools for interactive plotting.
These packages simply create an interface between R and JavaScript.
Debugging often requires some knowledge of JavaScript and the DOM.
In sum: usually simple, but can be difficult.
You can use the googleVis package to create Google plots from R.
Example from googleVis Vignettes.
# Create fake data
fake_compare <- data.frame(
country = c("US", "GB", "BR"),
val1 = c(10,13,14),
val2 = c(23,12,32))
library(googleVis)
line_plot <- gvisLineChart(fake_compare)
print(line_plot, tag = 'chart')
Note: This uses results='asis' in the code chunk head.
To show the chart in R, use plot instead of print and don't include tag = 'chart'.
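The two ways of showing the chart can be put side by side. This sketch assumes the googleVis package is installed and skips the chart steps if it is not; the output file name is made up.

```r
# Same fake data as above
fake_compare <- data.frame(
  country = c("US", "GB", "BR"),
  val1 = c(10, 13, 14),
  val2 = c(23, 12, 32))

if (requireNamespace("googleVis", quietly = TRUE)) {
  line_plot <- googleVis::gvisLineChart(fake_compare)

  # Interactive session: plot() opens the chart in your browser
  if (interactive()) plot(line_plot)

  # Document or website: print() with tag = 'chart' emits only the
  # embeddable chart HTML; here it is written to a file instead of
  # the console
  chart_file <- file.path(tempdir(), "line_plot.html")
  print(line_plot, tag = "chart", file = chart_file)
}
```

In a knitted document you would leave out the file argument so the HTML lands in the page itself.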
library(WDI)
co2 <- WDI(indicator = 'EN.ATM.CO2E.PC', start = 2010, end = 2010)
co2 <- co2[, c('iso2c','EN.ATM.CO2E.PC')]
# Clean
names(co2) <- c('iso2c', 'CO2 Emissions per Capita')
co2[, 2] <- round(log(co2[, 2]), digits = 2)
co2_map <- gvisGeoMap(co2, locationvar = 'iso2c',
numvar = 'CO2 Emissions per Capita',
options = list(
colors = '[0xfff7bc, 0xfec44f,
0xd95f0e]'
))
Note that 0x replaces # for hexadecimal colors.
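If your palette is already stored as ordinary # hex codes, the conversion can be done in R rather than by hand. A minimal sketch; to_gvis_colors is a hypothetical helper, not part of googleVis.

```r
# Hypothetical helper: turn '#rrggbb' codes into the '[0x..., 0x...]'
# string used in the colors option above
to_gvis_colors <- function(hex) {
  paste0("[", paste(sub("^#", "0x", hex), collapse = ", "), "]")
}

gvis_colors <- to_gvis_colors(c("#fff7bc", "#fec44f", "#d95f0e"))
print(gvis_colors)
```

The result can then be passed as options = list(colors = gvis_colors).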
CO2 Emissions (metric tons per capita)
print(co2_map, tag = 'chart')
Note: you will need to view googleVis maps that are in R Markdown documents in your browser rather than in RStudio's built-in HTML viewer.
More examples are available at: http://hertiedatascience2014.github.io/Examples/
Any file called index.html in a GitHub repository branch called gh-pages will be hosted as a website.
The URL will be:
http://GITHUB_USER_NAME.github.io/REPO_NAME
Note: you can use a custom URL if you own one. See https://help.github.com/articles/setting-up-a-custom-domain-with-github-pages/
First create a new branch in your repository called gh-pages:
Then sync your branch with the local version of the repository.
Finally switch to the gh-pages branch.
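If you prefer the command line to a graphical client, the branch can also be created locally and then published. The sketch below is one possible equivalent, not necessarily the workflow shown in class. It assumes git is installed and that the remote is named origin; setup_gh_pages is a hypothetical helper that by default only returns the commands without running them.

```r
# Hypothetical helper: returns (and optionally runs) the git commands
# for creating and publishing a gh-pages branch.
# Assumes git is on the PATH and the remote is called "origin".
setup_gh_pages <- function(repo_dir = ".", remote = "origin", dry_run = TRUE) {
  steps <- list(
    c("checkout", "-b", "gh-pages"),      # create the branch and switch to it
    c("push", "-u", remote, "gh-pages")   # publish it and track the remote branch
  )
  commands <- vapply(steps,
                     function(s) paste("git", paste(s, collapse = " ")),
                     character(1))
  if (!dry_run) {
    old_dir <- setwd(repo_dir)
    on.exit(setwd(old_dir))
    for (s in steps) system2("git", s)
  }
  commands
}

# Dry run: only shows what would be executed
setup_gh_pages()
```

Call setup_gh_pages(dry_run = FALSE) from inside the repository only once you are happy with the commands it lists.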
You can use R Markdown to create the index.html page as before.
Simply place a new .Rmd file in the repository called index.Rmd and knit it to HTML. Then push it back up.
Your website will now be live.
Every time you push to the gh-pages branch, the website will be updated.
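Knitting index.Rmd can also be done from the R console rather than with the Knit button. A minimal sketch, assuming the rmarkdown package and pandoc are available; the file contents and the temporary directory are made up for illustration, and in practice index.Rmd would sit in your repository.

```r
# Write a minimal, hypothetical index.Rmd to a temporary directory
site_dir <- file.path(tempdir(), "gh-pages-demo")
dir.create(site_dir, showWarnings = FALSE)
rmd_path <- file.path(site_dir, "index.Rmd")
writeLines(c(
  "---",
  "title: \"Project Website\"",
  "output: html_document",
  "---",
  "",
  "Key findings go here."
), rmd_path)

# Knit to index.html only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_path, output_file = "index.html", quiet = TRUE)
}
```

The resulting index.html is the file that gets served once it is committed and pushed to the gh-pages branch.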
Begin to create a website for your project with static and interactive graphics.
If relevant include:
A table of key results
A googleVis map
A bar or line chart with googleVis or other
A simulation plot created with Zelig showing key results from your regression analysis.
Push to the gh-pages branch.